ABSTRACT

A statistical technique has been developed for the validation of a Monte Carlo simulation
process or other processes whose results can be reduced to a finite time sequence of equally spaced
events with dichotomous outcomes. Essential to the technique is the Bahadur-Lazarsfeld
representation of the probability distribution of the populations consisting of all binary vectors with a
specific number of elements. This paper analyzes the properties of the test and the adequacy of the
Bahadur-Lazarsfeld representation for practical application purposes. It examines the probability of
rejection which results when the significance level ($\alpha$) is specified and computes the power of the
test when the null hypothesis is tested against known alternatives. The evidence collected suggests
that the statistical test is an adequate tool for the purposes for which it was designed.
EVALUATING THE VALIDATION OF A MONTE CARLO
SIMULATION OF BINARY TIME SERIES


INTRODUCTION 

This  report  describes  the  results  of  an  effort  to  characterize  the  properties  of  a  statistical  test 
developed  by  the  Naval  Research  Laboratory  (NRL)  to  validate  the  Antisubmarine  Warfare  Program 
Surveillance  Model  (APSURV)  Mod  1.4  simulation  model.  The  APSURV  digital  computer  simulation 
models  have  been  the  Navy’s  approved  undersea  surveillance  models  since  their  initial  development. 
They have been used to predict the performance of the Sound Surveillance System (SOSUS). The
validation effort was one of the first attempts to assess the adequacy of the APSURV models in
representing the detection performance of the SOSUS. This paper presents some preliminary results in
evaluating the validation method itself. Its focus is on the properties of a statistical test developed by Johnson
and  Wiener  [1]  designed  to  conduct  the  validation  effort.  Although  the  evaluation  has  been  limited  by 
the  large  amount  of  computer  time  required  to  perform  a  complete  analysis,  the  available  evidence 
confirms  the  adequacy  of  the  validation  procedure  and  opens  up  the  topic  as  an  area  for  further 
research. 

The term validation means determining whether, within the degree of assurance of statistical
tests, the simulation represents the process being modeled. In [1] a quantitative procedure for
simulation model validation was adopted and the required statistical tests were developed to conduct the
task. The validation technique can be briefly described as follows. Each Monte Carlo
replication of the simulation model for any one sensor produces a vector of $m$ binary elements. Based on a
sample of these binary vectors, a representation is obtained of the probability distribution of the
population of binary vectors from which the sample was drawn. Using this representation, the likelihood of
any vector of $m$ binary elements may be computed under the hypothesis that it comes from the same
statistical population as the vectors generated by the simulation model. In particular, the method
computes the likelihood of the binary vector resulting from an observed run of the actual process under the
same conditions that are represented in the simulation. The question of the validity of the simulation
model, at the significance level $\alpha$, is then resolved by observing whether the likelihood of the
experimentally obtained vector exceeds the $\alpha$-th percentile of the likelihoods of the simulation-generated
vectors. The statistical technique is appropriate for the validation of a Monte Carlo simulation of processes,
such as detection processes, whose results can be reduced to a finite sequence of thresholding events
having dichotomous, i.e., binary, outcomes. Moreover, the validation technique can be applied even
when the process being simulated cannot be experimentally repeated. Thus the simulation's validity for
the case examined can be determined, to within a specified level of statistical significance, with only a
single observation of the real-world process being simulated. The statistical technique is nonparametric,
does not assume independence between events occurring at different times, and does not require
the assumption of any stationarity or steady-state behavior of the process simulated. The simulation
validation procedure can be summarized as follows.

As a particular target traverses the surveillance zone it generates a track history of detections for
each sensor in the zone. For each sensor $i$ there is an observed vector $x_i = (x_{i1}, \ldots, x_{im})$ of 0's and
1's from an unknown probability distribution $p_i'$, and the simulation generates $n$ vectors
$x_g = (x_{g1}, \ldots, x_{gm})$ of 0's and 1's from a probability distribution $p_i$. Using the $n$ generated vectors, the
statistical technique obtains an estimate $\hat{p}_i$ of $p_i$. The simulated data are then applied to $\hat{p}_i$ to obtain the
sample distribution of the $n$ vectors as an approximation to the population distribution. The test
consists of determining whether the observed vector $x_i$ has a $\hat{p}_i$-value in the upper $1 - \alpha$ region of the
sample distribution. If it does, then the hypothesis that $x_i$ comes from the distribution $p_i$ is accepted. The
test is applied for all $i$, and many acceptances that $x_i$ is from $p_i$ will confirm the null hypothesis that $p_i$ is
a good approximation to $p_i'$, thereby validating the simulation. It is to be expected that due to statistical
fluctuation some sensors will fail the test. Hence a distinction must be made between validation of the
simulation model in general and specific statements about the simulation of the individual sensors.
Statements about the latter are simply understood to carry the uncertainty inherent in the statistical test
itself.


This paper examines the properties of the statistical test applied to validate the simulation. Of
immediate interest is the question of whether specifying the $\alpha$-th percentile of the sample distribution
does result in a probability of false rejection equal to $\alpha$. In addition there are such concerns as to how
one should deal with alternative hypotheses and what kind of power the test exhibits in evaluating
alternatives. Partial answers have been obtained to some of these questions and enough information has
been learned to establish the adequacy of the test procedure. A full analysis of the properties of the
test remains for further investigation.

Following the introduction in this report is a section on the major conclusions reached. It is
followed by a brief statement on the validation method that was evaluated. The next two sections consist
of a Monte Carlo study of the validation method and an analytical approach. Finally there is a
summary and a series of appendices presenting details, as necessary, on the validation method and technical
support material on the evaluation methods.

MAJOR  FINDINGS 

Each Monte Carlo replication of the simulation model for each sensor produces a vector of $m$
binary elements. When the number of elements is small, say $m = 2$ or 3, the statistical test at a
significance level $\alpha$ does not result in a probability of false rejection equal to $\alpha$. As $m$ increases, for $\alpha$
small, the probability of false rejection approaches $\alpha$ from below and it is practically $\alpha$ for $m = 5$ or 6.

Both a Monte Carlo evaluation and an analytical approach were taken to examine the power of the
test. In the former, two alternative hypotheses were tested against the same null hypothesis. Both
alternatives yielded high power, and the more dissimilar the alternative to the null hypothesis the higher the
power of the test. In the analytical approach a simple structure was considered that allowed one to
obtain power curves for the cases $m = 1$ and $m = 2$. These cases shed some light on the behavior of
the power function in general. Depending on the null hypothesis, the range of possible values of the
power of the test tends towards higher power as the two distributions become more dissimilar. This is
not to say that minimum power is achieved whenever the null and alternative hypotheses coincide. It
simply states that this is the case for the larger portion of the range of possible values of the alternative
distribution. The number of exceptions gets small as $\alpha$ gets small, and for small $\alpha$ in general the
exceptions occur within a range that remains relatively close to the null hypothesis anyway. This analysis will
be discussed later in more detail. Both approaches suggest, however, that the test performs adequately
with respect to the power of the test.

The statistical test consists of testing the null hypothesis that the observed vector is a sample from
the population from which the $n$ vectors produced by the simulation were generated. In so doing one
specifies a significance level $\alpha$, in this case the $\alpha$-th percentile of the sample distribution obtained from $n$
replications of the simulation. If the likelihood of the observed vector falls below the $\alpha$-th percentile of
the likelihoods of the simulation-generated vectors, the null hypothesis is rejected.

In order to conduct the evaluation of the test a Monte Carlo approach was first taken. Sample
distributions were obtained from simple known distributions. This was done by repeatedly generating $n$
vectors as if by the simulation and then generating from the same distribution an $(n+1)$-st or observed
vector. In each instance the test was conducted. By repeating the test 20 times a proportion of
rejections was computed. By further repetitions in blocks of 20 repetitions each, a sequence of independent,
identically distributed random variables was generated. The Central Limit Theorem then allowed one
to obtain confidence intervals on the resulting mean proportion of rejections, which should approximate
the resulting probability of rejection when applying the test. The evidence gathered supports the
conjecture that as the vector length $m$ increases the probability of false rejection begins to approach $\alpha$. For
the cases considered the results indicate that the probability of rejection is actually less than $\alpha$ for
$m = 2$ or 3, but it approaches $\alpha$ from below as $m$ increases and is practically $\alpha$ for $m = 5$ and 6. By varying
the above approach slightly the $(n+1)$-st vector was then generated from an alternate known distribution
different from the distribution of the first $n$ vectors. The resultant proportion of rejections now
estimates the power of the test. For the cases considered the test appears to discriminate very well.

To obtain the representation of the distribution of the population from which the $n$ simulated
vectors come, it is necessary to estimate the function $f(x)$ which measures the correlation effects among
elements of the vector $x$. Appendix A specifies the functional form of $f(x)$. An important
consequence of the evaluation is the fact that it is necessary to pay attention to the vector sample size $n$ to
ascertain when the estimate $\hat{f}(x)$ is close enough to its theoretical value $f(x)$ for the representation of
the distribution in question to be adequate.

VALIDATION  METHOD 

The simulation model in question, APSURV Mod 1.4, was designed to estimate the performance
of an acoustic sensor in the detection of a target in the ocean. Specific details of the validation
methodology for the simulation are given in Appendix A of this paper. The simulation is essentially based on
Monte Carlo replications of time-phased detect/no detect events and the validation considers a
real-world observed sequence of detect/no detect decisions made by the acoustic sensor. Table 1 illustrates
the structure of the validation data. The simulation generates $n$ replications of vectors with $m$ elements
and the test determines whether the random process generating the $n \times m$ matrix is equivalent to the
random process generating the $(n+1)$-st or observed vector. The method of Bahadur [2] and
Lazarsfeld [3] is used to obtain the representation of the probability distribution of the population from which
the $n$ simulation replications come. From this is obtained for each vector $x = (x_1, \ldots, x_m)$ a
likelihood value

$$p(x) = p_{[1]}(x) \cdot f(x)$$

where $p_{[1]}(x)$ is the probability of vector $x$ under the assumption of independent vector elements and
$f(x)$ is a function which represents the degree of correlation along the time stream.

For an $\alpha$-level test one checks whether the likelihood of the observed vector $p(x_0)$ is above the
first $n \cdot \alpha$ likelihoods from the simulated set. If so, the hypothesis that $x_0$ is a member of the same
random process generating the $n$ replications is not rejected.

A MONTE CARLO EVALUATION OF THE TEST

Of the various properties of the test one is interested in evaluating, the most immediate concern
is whether specifying the $\alpha$-th percentile of the sample distribution does result in a probability of false
rejection equal to $\alpha$. In addition there are such concerns as to how one should deal with alternative
hypotheses and what kind of power the test exhibits in evaluating alternatives. Another important
factor is the adequacy of the estimate of the probability distribution of the $n$ simulated replications of a
track history of detections. Of particular interest is whether the number of replications is sufficient for
the estimate of the correlation function $f(x)$ to settle about its theoretical value.

Table 1 — Structure of the Validation Data

240-HOUR TRACK, 50 MODEL REPLICATIONS
1: DETECTING THE TARGET, 0: NOT DETECTING TARGET

Elapsed Time (hours)               1     2     3     4     5     6    ...   239   240

Observed Detection History         0     0     1     1     1     0    ...

Model Replication     1            0     0     1     0     0     0    ...    1     1
                      2            0     1     0     0     0     0    ...    0     1
                      3            0     0     1     1     0     0    ...    0     1
                      ...
                      50           0     0     1     0     0     0    ...    0     0

Predicted Detection Probability   0.00  0.20  0.60  0.20  0.00  0.00  ...   0.00  0.60


Under the assumption that the simulation model is valid, the statistical test for a specific
sensor-track combination would normally constitute a Bernoulli trial with rejection probability $\alpha$ and probability
of no rejection $1 - \alpha$. Here the sample distribution (of the $n$ replications) is used as an approximation
to the theoretical distribution; hence the actual rejection probability may appear to be data-driven.
However, repetitions of the test itself under identical conditions are identically distributed and should
generate a proportion of rejections approximating the probability of rejection as the number of
repetitions increases. In general there are $n + 1$ vectors each of length $m$. Ideally the probability of rejection
should equal $\alpha$ when the $(n+1)$-st vector is from the same distribution as the first $n$, and it should equal
the power of the test against a specified alternative distribution when the $(n+1)$-st vector is from the
alternative distribution. To make inferences about the probability of rejection, successive repetitions of
the same test may be generated. The repetitions should be grouped into subsets of equal size, from
each of which a proportion of rejections may be computed. The computed proportions are a sequence
of independent, identically distributed random variables. Taking a large enough number of subsets, one
can then appeal to the Central Limit Theorem and use classical statistical techniques to make inferences
on the probability of rejection that results when applying the given test at a specific level $\alpha$.

A preliminary evaluation of the properties of the test has been conducted with the use of Monte
Carlo methods. It was basically intended to determine an estimate of the level of significance obtained
in the test. The approach taken used sets of $n + 1$ random binary vectors which were repeatedly
generated from a known distribution. Several cases were considered by varying the vector length $m$ and by
varying $n$, the number of replications, for each $m$. The vector elements in all cases were independent
Bernoulli random variables where the probability of a 1 varied by vector element. The various cases of
$m$ considered were 2, 3, 4, 5 and 6 vector elements. For the case $m = 6$ the probabilities of a 1 for the
vector elements were (.2, .4, .6, .8, .8, .8) respectively. Similarly one used for $m = 5$ the probability
vector (.2, .4, .6, .8, .8), for $m = 4$ (.2, .4, .6, .8), for $m = 3$ (.2, .4, .6) and for $m = 2$ (.2, .4). For
all of these distributions the correlation function has a theoretical value of $f(x) = 1$. In a typical case,
for example $m = 4$ and $n = 50$, a random number routine generates 51 vectors of length 4
and the test is applied to determine whether the 51st vector does or does not belong to the same
population from which the first 50 came. The decision is made whether or not to reject such a hypothesis.
This procedure is repeated 20 times and the proportion of rejections is computed. The sets of size 20
are replicated 100 times. The one hundred computed proportions of rejections are then treated as 100
independent, identically distributed random variables whose mean is an estimate of the probability of
false rejection resulting from the application of the test. The test itself consists of computing the
likelihood $q$ of the $(n+1)$-st vector and the set of likelihoods $\{p_g = p(x_g) : g = 1, \ldots, n\}$. The test
procedure is to reject the hypothesis of association if the observed value $q$ falls below the $\alpha$-th percentile of
the computed values $p_g$. Define $N$ to be the number of elements in the set $\{g : p_g \le q,\ 1 \le g \le n\}$. If
$N \ge n \cdot \alpha$, the hypothesis of association is not rejected. The hypothesis is rejected at the significance
level $\alpha$ if $N < n \cdot \alpha$.
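
The block procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the routine used in the study: it takes $f(x) = 1$ (likelihoods computed under independence), borrows the end-point correction $1/(2n)$ for the column estimates from the analytical section of this report, counts ties as $p_g \le q$, and uses a normal-approximation 95 percent interval. All function names are hypothetical.

```python
import random
import statistics


def bernoulli_vector(probs, rng):
    """Draw one binary vector with independent elements."""
    return [1 if rng.random() < p else 0 for p in probs]


def column_estimates(sample):
    """Column-wise estimates of P(1), kept away from 0 and 1 (assumed correction)."""
    n = len(sample)
    estimates = []
    for j in range(len(sample[0])):
        ones = sum(row[j] for row in sample)
        estimates.append(min(max(ones / n, 1 / (2 * n)), 1 - 1 / (2 * n)))
    return estimates


def likelihood(x, v_hat):
    """Likelihood of x under independence, i.e. assuming f(x) = 1."""
    p = 1.0
    for xj, vj in zip(x, v_hat):
        p *= vj if xj else 1.0 - vj
    return p


def one_test(null_probs, obs_probs, n, alpha, rng):
    """Run the test once; return True when the hypothesis is rejected."""
    sample = [bernoulli_vector(null_probs, rng) for _ in range(n)]
    observed = bernoulli_vector(obs_probs, rng)
    v_hat = column_estimates(sample)
    q = likelihood(observed, v_hat)
    count = sum(1 for x in sample if likelihood(x, v_hat) <= q)  # this is N
    return count < n * alpha


def rejection_rate(null_probs, obs_probs, n, alpha,
                   blocks=100, block_size=20, seed=0):
    """Mean block proportion of rejections with an approximate 95% interval."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(blocks):
        rejections = sum(one_test(null_probs, obs_probs, n, alpha, rng)
                         for _ in range(block_size))
        proportions.append(rejections / block_size)
    mean = statistics.mean(proportions)
    half_width = 1.96 * statistics.stdev(proportions) / len(proportions) ** 0.5
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    null = [0.2, 0.4, 0.6, 0.8, 0.8]
    print("false rejection:", rejection_rate(null, null, n=50, alpha=0.30))
    print("power, p = 0.1: ", rejection_rate(null, [0.1] * 5, n=50, alpha=0.30))
```

Passing the same probability vector twice estimates the probability of false rejection; passing a different vector for the observed draw estimates the power against that alternative.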

In this evaluation the value of $\alpha$ was fixed at $\alpha = 0.30$ and for each value of $m$ the number of
replications (i.e., values of $n$) used were from the set {20, 50, 100, 200, 500, 800}. Not all values
of $n$ were used for all values of $m$. Only for $m = 5$ were all values of $n$ used. In addition, as $m$ and $n$
became larger both the number of repetitions in each subset from which the proportion of rejections
was computed and the number of subsets itself were decreased due to limitations in computer time
and cost.

For the cases considered the significance level achieved appears to approach $\alpha$ from below as $m$,
the number of vector elements, increases.

The analysis was also conducted to estimate the power of the test when the null hypothesis was
tested against a specified alternative. For the case $m = 5$ tests were also conducted where the $(n+1)$-st
vector was generated from a known distribution other than the distribution from which the first $n$
vectors were generated. These alternate distributions consisted of independent, identically distributed
vector elements where the probability of a 1 was given by $p = 0.1$ in one case and $p = 0.5$ in another. The
theoretical value of the correlation function in this case is also $f(x) = 1$. The proportion of rejections
in this case estimates the power of the test, that is, the probability that the null hypothesis is correctly
rejected. When comparing both alternatives to the null hypothesis the test appears to discriminate very
well, and the more dissimilar the alternative to the null hypothesis the higher the power of the test.

An interesting problem revealed in the evaluation is that vectors with negative likelihoods arise as
a function of the fluctuation of computed values of $\hat{f}(x)$ around its theoretical value. In the cases
considered the theoretical value is $f(x) = 1$. As $n$ increases the distribution of the $n$ values $\hat{f}(x)$ tends
towards a spike at 1 and the number of vectors resulting in negative likelihoods (because $\hat{f}(x) < 0$)
decreases or disappears. To assess the effects of the estimates $\hat{f}$, tests were performed first using the
estimates $\hat{f}$ and then replacing these estimates with the actual theoretical values $f(x) = 1$.
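
How an estimated correlation factor can turn negative may be seen in a small sketch. The functional form actually used is the one in Appendix A, which is not reproduced in this section; the code below assumes, purely for illustration, the Bahadur expansion truncated after the second-order terms, $\hat{f}(x) = 1 + \sum_{i<j} \hat{r}_{ij} z_i z_j$, with $z_j = (x_j - \hat{v}_j)/\sqrt{\hat{v}_j(1 - \hat{v}_j)}$ and $\hat{r}_{ij}$ the sample mean of $z_i z_j$. The function name and the toy sample are hypothetical.

```python
import math
from itertools import combinations


def second_order_f(sample):
    """Return f_hat(x) for a second-order truncated Bahadur expansion (assumed form).

    Every column of the sample must contain both a 0 and a 1, so that the
    standard deviations below are nonzero.
    """
    n = len(sample)
    m = len(sample[0])
    means = [sum(row[j] for row in sample) / n for j in range(m)]
    sds = [math.sqrt(a * (1.0 - a)) for a in means]

    def standardize(x):
        return [(xj - a) / s for xj, a, s in zip(x, means, sds)]

    z_rows = [standardize(row) for row in sample]
    # Sample second-order correlations r_ij = mean of z_i * z_j.
    r = {(i, j): sum(z[i] * z[j] for z in z_rows) / n
         for i, j in combinations(range(m), 2)}

    def f_hat(x):
        z = standardize(x)
        return 1.0 + sum(r_ij * z[i] * z[j] for (i, j), r_ij in r.items())

    return f_hat


if __name__ == "__main__":
    # Ten vectors of length 3: three with a single 1, seven all-zero.
    sample = [(1, 0, 0), (0, 1, 0), (0, 0, 1)] + [(0, 0, 0)] * 7
    f_hat = second_order_f(sample)
    for x in [(0, 0, 0), (1, 1, 0), (1, 1, 1)]:
        print(x, round(f_hat(x), 4))
```

In this toy sample the pairwise correlations are slightly negative, and for the vector (1, 1, 1), which never occurs in the sample, the truncated factor falls below zero, so the product of the independence likelihood and the factor is negative as well.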

The data were collected by cases of $m$, $n$, and either $f(x) = \hat{f}$ or $f(x) = 1$. For each case
labeled $(m, n, \hat{f})$ there corresponds a case $(m, n, 1)$. To obtain an overall measure of the resulting
probability of false rejection when the test was applied at a significance level $\alpha$ for each $m$, the resulting
estimates $\hat{\alpha}$ were averaged over all values of $n$ used for that specific $m$. These resulting averages can be
labeled $\bar{\alpha}_{m,\hat{f}}$ and $\bar{\alpha}_{m,1}$. To compute the $\bar{\alpha}_{m,\hat{f}}$ only those values of $n$ were included for which the
estimates $\hat{f}$ were considered adequate based on a comparison with the $f(x) = 1$ cases. Figure 1 plots the
overall average probability of false rejection resulting for each $m$. For the cases considered the results
indicate that the probability of rejection is actually less than $\alpha$ for $m = 2$ or 3 but it increases to $\alpha$ as $m$
increases and is practically $\alpha$ for $m = 5$ and 6.

In addition, in these cases where the theoretical correlation function is $f(x) = 1$ some attention
must be paid to having a large enough number of replications ($n$) to make certain that the computed
values $\hat{f}(x)$ are an appropriate representation of the theoretical correlation function. An alternative
would be to find a better estimator for $f(x)$.

The conclusions of these evaluations cannot be accepted as general because the analysis was
limited to a few known distributions of simple structure. Enough evidence was collected, however, to
establish confidence in the adequacy of the procedure developed by Johnson and Wiener [1].

Appendix B contains more detailed information on the Monte Carlo evaluation including the data
collected and tables and figures representing the information gathered from the evaluation.

AN  ANALYTICAL  APPROACH 

The statistical test consists of computing the likelihood $q$ of the $(n+1)$-st vector $x_0$ and the set of
likelihoods $\{p_g = p(x_g) : g = 1, \ldots, n\}$, one likelihood for each of the $n$ replication vectors
$\{x_g : g = 1, \ldots, n\}$. The test procedure is to reject the hypothesis of association if the observed value $q$
falls below the $\alpha$-th percentile of the computed values $p_g$. Define $N$ to be the number of elements in
the set $\{g : p_g \le q,\ 1 \le g \le n\}$. If $N \ge n \cdot \alpha$ the hypothesis of association is not rejected. The
hypothesis is rejected at the significance level $\alpha$ if $N < n \cdot \alpha$.


As discussed in Appendix A, the likelihood of any vector $x$ is given by its Bahadur representation

$$p(x) = p_{[1]}(x) \cdot f(x)$$

where $p_{[1]}(x)$ is the probability of vector $x$ under the assumption of independent vector elements and
$f(x)$ is a function which represents the degree of correlation between vector elements. In this section
only the simplest cases will be considered. The $n \times m$ data matrix will consist of $n$ independent vectors
each of length $m$ where the entries $x_{ij}$ are independent, identically distributed Bernoulli random
variables with probability of a 1 given by $v$. In this case for any vector $x$, $f(x) = 1$, and hence
$p(x) = p_{[1]}(x)$. Now let $E(\cdot)$ denote expected value; then

$$v = E(x_{ij}), \qquad i = 1, \ldots, n, \quad j = 1, \ldots, m$$

where it is assumed $0 < v < 1$. In this case maximum likelihood estimates are used to obtain $\hat{v}$. Since
this is a simplification of a slightly more general representation, the estimates are obtained from the
columns of the $n \times m$ matrix.


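$$p(x_g) = \prod_{j=1}^{m} \hat{v}_j^{\,x_{gj}} (1 - \hat{v}_j)^{1 - x_{gj}}, \qquad g = 1, 2, \ldots, n$$

where

$$\hat{v}_j = \begin{cases} 1/(2n) & \text{if } \sum_{i=1}^{n} x_{ij} = 0 \\ 1 - 1/(2n) & \text{if } \sum_{i=1}^{n} x_{ij} = n \\ (1/n) \sum_{i=1}^{n} x_{ij} & \text{otherwise} \end{cases} \qquad \text{for } j = 1, 2, \ldots, m.$$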

Notice first that $v_i = v_j$ in our case for all $i, j = 1, \ldots, m$ and secondly, since the $v_i$'s are assumed to
be neither 0 nor 1, a reasonable correction is made should the data seem to indicate they are when
estimating them. Even though $v_i = v_j$ for all $i, j = 1, \ldots, m$, it is likely that $\hat{v}_i \ne \hat{v}_j$ for most $i, j$
pairs.
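
The column estimates and the resulting independence likelihood can be written out directly. The following is a minimal sketch under the assumptions of this section ($f(x) = 1$, estimates taken column by column, end-point correction $1/(2n)$); exact fractions are used so the values can be compared by hand, and the function names are illustrative only.

```python
from fractions import Fraction


def column_estimates(sample):
    """Estimate v_j from column j of the n x m matrix, avoiding 0 and 1."""
    n = len(sample)
    estimates = []
    for j in range(len(sample[0])):
        ones = sum(row[j] for row in sample)
        if ones == 0:
            estimates.append(Fraction(1, 2 * n))
        elif ones == n:
            estimates.append(1 - Fraction(1, 2 * n))
        else:
            estimates.append(Fraction(ones, n))
    return estimates


def likelihood(x, v_hat):
    """Product over elements of v_hat_j if x_j = 1, else 1 - v_hat_j."""
    p = Fraction(1)
    for xj, vj in zip(x, v_hat):
        p *= vj if xj == 1 else 1 - vj
    return p


if __name__ == "__main__":
    sample = [(1, 0, 1), (1, 0, 0), (1, 1, 0), (1, 0, 1)]  # n = 4, m = 3
    v_hat = column_estimates(sample)
    print("estimates:", v_hat)
    for x in sample:
        print(x, likelihood(x, v_hat))
```

In the example the first column is all 1's, so its estimate is corrected to $1 - 1/(2n) = 7/8$ rather than 1; without the correction every vector with a 0 in that position would receive likelihood zero.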


The particular data structure considered in this section shows that it may be the case that the
probability of false rejection when applying the test approaches the significance level $\alpha$ as $m$, the vector
length, increases. For any vector length $m$ there are $2^m$ possible binary outcomes or members of the
population of vectors of length $m$. The analysis consists of letting $n = 2^m$ where all binary vector
elements consist of independent, identically distributed Bernoulli random variables with the probability of
a 1 given by $v$. There are $(2^m)^{2^m}$ possible outcomes from which the Bahadur representation may be
obtained. To each of these outcomes one may associate $2^m$ possible $(n+1)$-st or observed vectors,
yielding a total of $(2^m)^{2^m+1}$ elementary outcomes. Consider first the case where the observed vectors come
from the same distribution as the sets of first $n = 2^m$ vectors. Specifying $\alpha$, evaluating $N$ for each
outcome, and applying the rules of the test, one may establish an association between $\alpha$ and the actual
probability of false rejection. Table 2 lists all possible outcomes when performing the test for the case
$m = 1$. This table shows the 4 cases of "simulated" vectors that are possible, the estimates $\hat{v}$ of $v$ that
each case generates, the computed vector likelihoods $p_g$ for each case, the likelihoods $q$ for each of the
possible "observed" vectors, and the value of $N$ for each combination of simulated vectors and observed
vector. Here $n = 2^m = 2$, $(2^m)^{2^m} = 4$ and $(2^m)^{2^m+1} = 8$. In this case, the probability of false rejection
is independent of the specified value of $\alpha$ since $N$ assumes only two values, either 0 or 2. For any
$\alpha \in (0, 1)$ the null hypothesis is rejected only if $N = 0$; this occurs only when $x_1 = x_2 = 1$ and
$x_0 = 0$, with probability $v^2(1 - v)$, or when $x_1 = x_2 = 0$ and $x_0 = 1$, with probability $(1 - v)^2 v$. Hence, for
any $\alpha \in (0, 1]$ the resulting probability of rejection is given by $p_r(v) = v^2(1 - v) + (1 - v)^2 v =
v(1 - v)$, depending only on $v$. The function $p_r(v)$ has domain [0, 1], is concave, and is symmetric
around the point $v = 1/2$ where it achieves its maximum of 1/4. If one considers next the case where
the observed vector comes from a different distribution, say a Bernoulli random variable with the
probability of a 1 given by $t$, then the power of the test is also independent of $\alpha$ and is given by
$1 - \beta = v^2(1 - t) + (1 - v)^2 t$, where $\beta$ is the probability of a type II error (false acceptance). The
power of the test depends only on the values of $v$ and $t$. Figure 2(a) shows $p_r(v)$ as a function of $v$ and
Fig. 2(b) shows $1 - \beta$ as a function of $t$ for various values of $v$. For each value $v_0$ the function $1 - \beta$ is a
straight line in $t$ with $y$ intercept $v_0^2$ and slope $(1 - 2v_0)$. Figure 2 illustrates the behavior of the power
function. When $v < 1/2$, $1 - \beta$ increases as $t$ increases for $t > v$ and $1 - \beta$ decreases as $t$ decreases
for $t < v$; hence the increase in power occurs as $t$ and $v$ become more dissimilar for the larger portion of
the range of $t$. When $v > 1/2$, $1 - \beta$ increases as $t$ decreases for $t < v$ and $1 - \beta$ decreases as $t$
increases for $t > v$; hence the increase in power occurs as $t$ and $v$ become more dissimilar for the larger
portion of the range of $t$. In either case, as $v$ becomes small or large ($p_r(v)$ small) the range of $t$ for
which power decreases as $t$ and $v$ become more dissimilar is very small. Naturally for $v = 1/2$ (all
vectors equally likely) $1 - \beta = p_r(v) = 1/4$ for all values of $t$, that is, the power function is a constant
equaling the probability of false rejection for all values of $t$.
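
The enumeration argument can be checked mechanically for small $m$. The sketch below lists every elementary outcome for $n = 2^m$, computes $N$ with the column estimates and independence likelihoods of this section, and sums the probabilities of the outcomes that lead to rejection. Two readings are assumed here because the inequality signs are ambiguous in the text: ties are counted ($p_g \le q$), which agrees with the values of $N$ in Table 2, and rejection occurs when $N < n \cdot \alpha$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def column_estimates(sample):
    """Column-wise estimates of v with the 1/(2n) end-point correction."""
    n = len(sample)
    estimates = []
    for j in range(len(sample[0])):
        ones = sum(row[j] for row in sample)
        if ones == 0:
            estimates.append(Fraction(1, 2 * n))
        elif ones == n:
            estimates.append(1 - Fraction(1, 2 * n))
        else:
            estimates.append(Fraction(ones, n))
    return estimates


def likelihood(x, v_hat):
    """Independence likelihood of x given the column estimates."""
    p = Fraction(1)
    for xj, vj in zip(x, v_hat):
        p *= vj if xj == 1 else 1 - vj
    return p


def outcomes(m):
    """Yield (sample, observed, N) for every elementary outcome with n = 2**m."""
    vectors = list(product((0, 1), repeat=m))
    for sample in product(vectors, repeat=2 ** m):
        v_hat = column_estimates(sample)
        p_sim = [likelihood(x, v_hat) for x in sample]
        for observed in vectors:
            q = likelihood(observed, v_hat)
            yield sample, observed, sum(1 for p in p_sim if p <= q)


def true_probability(x, prob_one):
    """Probability of x when every element is Bernoulli(prob_one)."""
    ones = sum(x)
    return prob_one ** ones * (1 - prob_one) ** (len(x) - ones)


def rejection_probability(m, alpha, v, t):
    """Exact probability of rejection: simulated vectors use v, observed uses t."""
    n = 2 ** m
    total = Fraction(0)
    for sample, observed, count in outcomes(m):
        if count < n * alpha:
            weight = true_probability(observed, t)
            for x in sample:
                weight *= true_probability(x, v)
            total += weight
    return total


if __name__ == "__main__":
    v, t, alpha = Fraction(1, 3), Fraction(3, 4), Fraction(3, 10)
    print("m = 1, false rejection:", rejection_probability(1, alpha, v, v))
    print("m = 1, power:          ", rejection_probability(1, alpha, v, t))
    print("m = 2, values of N:    ", sorted({c for _, _, c in outcomes(2)}))
```

Setting $t = v$ gives the probability of false rejection and any other $t$ gives the power. The same routine runs for $m = 2$ (1024 outcomes), while $m = 3$ already involves $8^9$ outcomes and is impractical to enumerate this way.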


Table 2 — Test Outcomes for Case $m = 1$
Specified Data Structure

Simulated        Estimated     Likelihood of      Likelihood of            Number of cases
vectors $x_g$    value of v    vector $x_g$       observed vector $x_0$    less than q
                               $p(x_g)$           $q$                      $N$
g = 1   g = 2    $\hat{v}$     g = 1   g = 2      $x_0 = 0$   $x_0 = 1$    $x_0 = 0$   $x_0 = 1$

  1       1        3/4          3/4     3/4          1/4         3/4           0           2
  1       0        1/2          1/2     1/2          1/2         1/2           2           2
  0       1        1/2          1/2     1/2          1/2         1/2           2           2
  0       0        1/4          3/4     3/4          3/4         1/4           2           0

Fig. 2 - Test characteristics for case $m = 1$. (a) Probability of false rejection. (b) Power of the test for various values of $v$.

For larger values of $m$ the number of elementary outcomes to consider increases very rapidly, yet one can discern a pattern from the cases $m = 1$, 2, and 3. Table 1 listed all possible outcomes for the case $m = 1$. Similarly one may list all possible outcomes for the case $m = 2$. There are 1024 of these. Following the rules of the test one may evaluate the value of $N$ for each outcome. $N$ assumes only the integer values 0, 1, 2, and 4. For the case $m = 3$ enough outcomes were listed until it became obvious that $N$ assumes only the integer values 0, 1, 2, 3, 4, 5, 6, and 8. The integral scale suggests a partition of $(0, 1)$ into nonoverlapping subintervals where $\alpha$ may assume its values. For the case $m = 2$ the test indicates that for $\alpha \in (0, 1/4]$ one rejects the null hypothesis if and only if $N = 0$, for $\alpha \in (1/4, 1/2]$ one rejects if and only if $N \le 1$ ($N = 0$ or 1) and for $\alpha \in (1/2, 1)$ one rejects if and only if $N \le 2$ ($N = 0$, 1 or 2). The case $\alpha = 1$ is not realistic and should be ignored as a triviality. For $m = 2$ one considers then 3 subintervals of $(0, 1)$ where $\alpha$ may assume its values. By listing the 1024 elementary outcomes and their corresponding values of $N$ one may single out those events that correspond to a rejection of the null hypothesis for each of the 3 subintervals where $\alpha$ assumes its values. The events for which $N = 0$ are rejection events for $\alpha \in (0, 1/4]$, the events for which $N = 0$ or 1 are rejection events for $\alpha \in (1/4, 1/2]$ and the events for which $N = 0$, 1 or 2 are rejection events for $\alpha \in (1/2, 1)$. One associates to each event its corresponding probability and adds the event probabilities corresponding to each subinterval where $\alpha$ assumes values. This means that for values of $\alpha$ in the intervals $I_1 = (0, 1/4]$, $I_2 = (1/4, 1/2]$ and $I_3 = (1/2, 1)$ there correspond three different probability of rejection functions $p_{r1}(v)$, $p_{r2}(v)$ and $p_{r3}(v)$ depending only on $v$. These functions have range $[0, h_1]$, $[0, h_2]$ and $[0, h_3]$ respectively where $h_1 \in I_1$, $h_2 \in I_2$ and $h_3 \in I_3$. The functions $p_{r2}(v)$ and $p_{r3}(v)$ corresponding to the larger values of $\alpha$ are concave and symmetric around $v = 1/2$ where they achieve their maxima $h_2$ and $h_3$. As $\alpha$ gets smaller, in this case $\alpha \in I_1$, the function $p_{r1}(v)$ begins to behave in a different manner. It remains symmetric around $v = 1/2$ but in this instance becomes bimodal. It rises quickly to a maximum achieved at about $v = .3$ (also $v = .7$) and remains fairly close to its maximum for values of $v$ between $v = .3$ and $v = .7$. These functions are plotted in Appendix C.

For values of $m > 1$ the number of nonoverlapping subintervals covering $(0, 1]$ where $\alpha$ may be specified is given by $2^m - 1$. The rightmost subinterval always has length $1/2^{m-1}$. All others have length $1/2^m$. The length of the subintervals decreases rapidly and to each one there corresponds a unique probability of rejection function $p_r(v)$ with range $[0, h]$ where $h$ is contained in the subinterval where $\alpha$ assumes its value. The functions $p_r(v)$ depend only on $v$ and are symmetric around $v = 1/2$. As $m$ increases $h \to \alpha$ and it appears that for smaller $\alpha$'s the functions $p_r(v)$ tend to achieve their maximum rapidly and remain close to this maximum for values of $v$ between the modes, approximating a rectangular shape. This means that for large $m$ and small $\alpha$ the probability of false rejection is approximately $\alpha$ for all values of $v$ except perhaps those close to 0 or 1. This behavior supports that observed from the computer evaluation. In this case as $m$ becomes large and $\alpha$ becomes small the probability of rejection approaches $\alpha$.
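The partition of $(0, 1]$ can be generated mechanically from the rejection rule. The sketch below is a hedged illustration that assumes $n = 2^m$ simulated vectors (consistent with the 1024 outcomes quoted for $m = 2$) and that $N$ takes every integer value from 0 to $n$ except $n - 1$, as observed for $m = 2$ and $m = 3$. Under the rule "reject when $N < n\alpha$" the set of rejected values of $N$ changes only when $\alpha$ crosses $k/n$ for an attainable $k$. The function name is hypothetical.

```python
from fractions import Fraction

def alpha_subintervals(m):
    """Half-open subintervals (lo, hi] of (0, 1] on which the rejection rule is constant.

    Assumes n = 2**m replications and attainable N in {0, ..., n - 2, n}.
    """
    n = 2 ** m
    attainable = [k for k in range(n + 1) if k != n - 1]
    # The rule "reject when N < n * alpha" changes at alpha = k / n for 0 < k < n.
    cuts = [Fraction(k, n) for k in attainable if 0 < k < n]
    edges = [Fraction(0)] + cuts + [Fraction(1)]
    return list(zip(edges[:-1], edges[1:]))

intervals = alpha_subintervals(2)   # three subintervals with edges 0, 1/4, 1/2, 1
```

For $m = 2$ this reproduces the three subintervals used above, and for any $m$ it yields $2^m - 1$ subintervals with the rightmost one twice as long as the others.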

Appendix C contains plots of the different probability of rejection functions and the power functions for the case $m = 2$.

SUMMARY 

Both  the  Monte  Carlo  evaluation  and  the  analytical  approach  indicate  that  the  statistical  test 
developed  by  Johnson  and  Wiener  is  a  valuable  tool  that  accomplishes  its  objective  as  a  technique  for 
validating  the  simulation  model.  In  evaluating  the  properties  of  the  test  several  questions  have  arisen 
that  remain  open  for  further  investigation.  However,  there  is  sufficient  evidence  to  suggest  that  the 
test  technique  is  a  good  discriminating  tool  that  may  be  applicable  to  simulation  models  resulting  in 
binary  time  series  output  in  general. 
Appendix A

VALIDATION  METHODOLOGY 


The simulation model in question was designed to estimate the performance of an acoustic sensor in the detection of a target in the ocean. The target's acoustic characteristics and the target track, i.e., the history of the geographical positions of the target at each hour, are provided as input to the simulation model. The model operates by replicating this track many times. During a particular replication, at each hour, the model decides that the sensor either is detecting or is not detecting the target, based on factors such as sensor-target range and geometry; the acoustic properties of the ocean in the sensor-target vicinity; sensor alertment due to possible detections at previous hours of this replication; and the magnitude of a Gauss-Markov fluctuation approximated by summing three independent Ehrenfest random walk terms (Feller [A1]) at each hour. Thus for a target track lasting $m$ hours, a single replication by the model produces a vector of $m$ elements, each element being either 1 or 0, where 1 indicates the sensor is detecting and 0 not detecting the target at a particular hour. It is clear that a high degree of correlation exists between the detection events occurring at consecutive hours. This is because the probability of detection of a given target by a given sensor is a strong function of the target's geographical position. In addition there is an alertment effect, because of prior detections, that must be considered. Moreover, it would be most unusual for the same probability of detection to obtain at all points of a target track. Since a moving target is changing its position and aspect relative to the sensor as time progresses, no steady state will be reached in the finite duration of a target track.

The recorded history of detections of a given target by a given acoustic sensor must be considered as a sample of size one from a population with an unknown statistical distribution. It is a random binary vector of 0's and 1's with $m$ elements. A factor affecting the selection of the validation technique is the lack of data concerning realizations of the simulated process. Multiple realizations of the same target track are impossible to obtain for most targets and sensors of interest. Hence one is testing the validity of the simulation model as a representation of the statistical structure underlying the recorded history of detections. That is, the hypothesis being tested is that the unknown statistical distribution of which the observed history of detections constitutes a single sample point is the same as the unknown statistical distribution of which the model's replications constitute many sample points.

As a particular target traverses the surveillance zone it generates a track history of detections for each sensor in the zone. For each sensor $i$ there is an observed vector $x_i = (x_{i1}, \ldots, x_{im})$ of 0's and 1's from an unknown probability distribution $p_0^i$ and the simulation generates $n$ vectors $x = (x_1, \ldots, x_m)$ of 0's and 1's from a probability distribution $p_i$. Using the $n$ generated vectors the statistical technique obtains an estimate $\hat{p}_i$ of $p_i$. The simulated data are then applied to $\hat{p}_i$ to obtain the sample distribution of the $n$ vectors as an approximation to the population distribution. The test consists of determining whether the observed vector $x_i$ has $\hat{p}_i$-value in the upper $1 - \alpha$ region of the sample distribution. If it does then the hypothesis that $x_i$ comes from the distribution $p_i$ is accepted. The test is applied for all $i$ and many acceptances that $x_i$ is from $p_i$ will confirm the null hypothesis that $p_i$ is a good approximation to $p_0^i$, thereby validating the simulation. It is to be expected that due to statistical fluctuation some sensors will fail the test. Hence a distinction must be made between validation of the simulation model in general and specific statements about the simulation of the individual sensors. Statements about the latter are simply understood to carry the uncertainty inherent in the statistical test itself.


The statistical hypothesis test uses the representation by Bahadur [A2] and Lazarsfeld [A3] for the probability distribution underlying binary sequences, which is summarized as follows. Assume the target track is $m$ hours long. Let $X$ be the set of all points $x = (x_1, x_2, \ldots, x_m)$ with each $x_i = 0$ or 1, and suppose $p(x)$ is a probability distribution on the elements of $X$, that is, $p(x) \ge 0$ for all $x \in X$ and $\sum_{x \in X} p(x) = 1$. Let $E_p(\,\cdot\,)$ denote the expected value of the expression in parentheses when the distribution $p$ obtains. Then let

$$v_i = E_p(x_i), \qquad 0 < v_i < 1; \qquad i = 1, 2, \ldots, m; \text{ and}$$

$$z_i = (x_i - v_i)/\sqrt{v_i (1 - v_i)}, \qquad i = 1, 2, \ldots, m.$$

Next define the family

$$r_{ij} = E_p(z_i z_j), \qquad i < j;$$

$$r_{ijk} = E_p(z_i z_j z_k), \qquad i < j < k;$$

$$\vdots$$

$$r_{12 \ldots m} = E_p(z_1 z_2 \cdots z_m).$$

For $x = (x_1, x_2, \ldots, x_m)$ define

$$p_{[1]}(x) = \prod_{i=1}^{m} v_i^{x_i} (1 - v_i)^{1 - x_i}$$

and

$$f(x) = 1 + \sum_{i<j} r_{ij} z_i z_j + \sum_{i<j<k} r_{ijk} z_i z_j z_k + \cdots + r_{12 \ldots m} z_1 z_2 \cdots z_m. \tag{1}$$

Then for each $x$ in $X$,

$$p(x) = p_{[1]}(x) f(x). \tag{2}$$

Thus $p_{[1]}(x)$ denotes the joint probability distribution of the $x_i$'s under an assumption that the $x_i$'s are independently distributed, and $f(x)$ represents the effects of correlation.
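When all correlation orders are retained, the representation reproduces the distribution exactly, and this can be verified numerically for a small $m$. The following sketch is illustrative: the function names `bahadur_terms` and `reconstruct` and the example weights are hypothetical choices, not part of the original method. It computes the $v_i$ and the full family of correlations from a known distribution on $\{0, 1\}^3$ and then rebuilds each probability as $p_{[1]}(x) f(x)$.

```python
from itertools import combinations, product
from math import isclose, prod, sqrt

def bahadur_terms(p, m):
    """Return (v, r) for a distribution p given as {binary m-tuple: probability}."""
    v = [sum(px * x[i] for x, px in p.items()) for i in range(m)]

    def z(x, i):
        return (x[i] - v[i]) / sqrt(v[i] * (1 - v[i]))

    r = {}
    for order in range(2, m + 1):               # second order up to m-th order
        for S in combinations(range(m), order):
            r[S] = sum(px * prod(z(x, i) for i in S) for x, px in p.items())
    return v, r

def reconstruct(x, v, r):
    """Evaluate p_[1](x) * f(x) from the marginals v and the correlations r."""
    m = len(x)
    zs = [(x[i] - v[i]) / sqrt(v[i] * (1 - v[i])) for i in range(m)]
    p1 = prod(v[i] if x[i] else 1 - v[i] for i in range(m))
    f = 1 + sum(rS * prod(zs[i] for i in S) for S, rS in r.items())
    return p1 * f

# Hypothetical example distribution on {0,1}^3; the weights are illustrative only.
points = list(product((0, 1), repeat=3))
weights = [5, 1, 2, 2, 1, 3, 2, 8]
p = {x: w / sum(weights) for x, w in zip(points, weights)}

v, r = bahadur_terms(p, 3)
max_error = max(abs(reconstruct(x, v, r) - p[x]) for x in points)
```

Dropping the higher-order entries of `r` before calling `reconstruct` gives the truncated form, which is no longer guaranteed to be nonnegative or to sum to one.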

In this representation it is natural to refer to the parameters $r_{ij}$ as second order correlations, to the parameters $r_{ijk}$ as third order correlations, and so forth, culminating in $r_{12 \ldots m}$, the $m$-th order correlation. The distribution $p$ then is said to have order $s$ if one correlation of order $s$ is non-zero and all correlations of order greater than $s$ are equal to zero. If a distribution is known to be of a certain order $s$, then the representation (1) need only extend to correlations of order $s$ or less.

In some applications either the nature of the situation being studied or computational problems might make it necessary to assume a specified order to the distribution, even though the value of that order cannot be known precisely. If the selection is in error, then the "truncated" form of expression (1) will be in error and so will the resulting values of $f(x)$ and $p(x)$. As defined by (6), the estimated $\hat{p}(x)$ may not be a probability distribution and may assume negative values for some $x$. This point is discussed by Bahadur [A2]. In addition, to obtain the estimate $\hat{p}$ of (2) one must first obtain the estimate $\hat{f}$ of (1). The statistical fluctuation associated with the values $\hat{f}(x)$ is another source of error leading to negative values of $\hat{f}(x)$ for some $x$ and consequently for $\hat{p}(x)$. Because $\hat{p}$ may not be a probability distribution, values taken by $\hat{p}$ are referred to as likelihood values.

In the application to the APSURV simulation several considerations led to truncating the form of (1). A fourth-order approximation to the Bahadur-Lazarsfeld representation has been employed, truncating the expression in (1) after the fourth-order correlations and using a time window of twelve hours, that is, assuming zero correlation between time steps more than twelve hours apart. From observations of the detection process it seemed reasonable to assume both that the correlation between epochs separated by more than twelve hours is negligible compared to the correlations between epochs closer together, and also that the contribution of correlations of order greater than four is relatively insignificant. The truncation has also been necessary in order to keep the computer costs within reason.
The observed realization compared with the simulation results consists of a track history of duration $m$ hours together with the associated detection history. The simulation model is programmed to run with input parameters characterizing the observed situation. Then $n$ replications of the model are run, each producing a time series (vector) $x_g = (x_{g1}, x_{g2}, \ldots, x_{gm})$, $g = 1, 2, \ldots, n$, of $m$ binary elements, where 1 denotes a detection and 0 no detection of the target by the acoustic sensor. Actual values of $m$ and $n$ used were $m = 240$ and $n = 50$. The $n$ binary vectors are used to estimate the parameters in the Bahadur-Lazarsfeld representation of the probability distribution corresponding to the population from which the $n$ sample vectors are generated. Simple unbiased estimators were chosen for all parameters. The estimators for the $v_i$'s are maximum likelihood when the distribution $p_{[1]}$ obtains, that is, when the $x_{gi}$'s are independently distributed. Since the $v_i$'s are assumed to be neither 0 nor 1 a reasonable correction is made should the data seem to indicate they are. The estimates are obtained by:


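$$\hat{v}_i = \begin{cases} 1/(2n) & \text{if } \sum_{g=1}^{n} x_{gi} = 0 \\ 1 - 1/(2n) & \text{if } \sum_{g=1}^{n} x_{gi} = n \\ (1/n) \sum_{g=1}^{n} x_{gi} & \text{otherwise} \end{cases} \qquad \text{for } i = 1, 2, \ldots, m; \tag{3}$$

$$\hat{z}_{gi} = (x_{gi} - \hat{v}_i)/\sqrt{\hat{v}_i (1 - \hat{v}_i)}, \qquad i = 1, 2, \ldots, m; \quad g = 1, 2, \ldots, n; \tag{4}$$

and

$$\hat{r}_{ij} = (1/n) \sum_{g=1}^{n} \hat{z}_{gi} \hat{z}_{gj}, \qquad 1 \le i < j \le m;$$

$$\hat{r}_{ijk} = (1/n) \sum_{g=1}^{n} \hat{z}_{gi} \hat{z}_{gj} \hat{z}_{gk}, \qquad 1 \le i < j < k \le m;$$

$$\vdots$$

$$\hat{r}_{12 \ldots m} = (1/n) \sum_{g=1}^{n} \hat{z}_{g1} \hat{z}_{g2} \cdots \hat{z}_{gm}. \tag{5}$$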

The likelihood $\hat{p}_g$ of the $g$-th model replication $x_g$ is given by

$$\hat{p}_g = \hat{p}(x_g) = \hat{f}(x_g) \prod_{i=1}^{m} \left[ \hat{v}_i^{x_{gi}} (1 - \hat{v}_i)^{1 - x_{gi}} \right], \qquad g = 1, 2, \ldots, n, \tag{6}$$

with $\hat{f}(x_g)$ as characterized by (1) and the $\hat{z}$-values as given by (4). Then under the hypothesis that the random mechanism for the simulation is a model for the random mechanism underlying the observed sensor's detections, the recorded sequence of detections of that target by the specified sensor, the binary vector $x = (x_1, x_2, \ldots, x_m)$, has likelihood $q = \hat{p}(x)$, relative to the Bahadur-Lazarsfeld representation of the $n$ model replications given by (2) using the $\hat{v}$'s and $\hat{r}$'s computed by (3) and (5) from the model replications. Once the numbers $q = \hat{p}(x)$ and $\{\hat{p}_g = \hat{p}(x_g) : g = 1, 2, \ldots, n\}$ have been obtained, it can be determined whether to accept the simulation model as valid at a significance level $\alpha$. The test procedure (a straightforward rank test) is to reject the hypothesis of association if the observed value $q$ falls below the $\alpha$-th percentile of computed values $\hat{p}_g$. Define $N$ to be the number of elements in the set $\{\hat{p}_g : \hat{p}_g \le q,\ 1 \le g \le n\}$. Then if $N \ge n\alpha$, the simulation model is determined to be valid in predicting the performance of the specified acoustic sensor in detecting that target. The model is rejected as not valid at significance level $\alpha$ if $N < n\alpha$. The reason a one-sided test is used here is that the higher the likelihood of the recorded detection history relative to the Bahadur-Lazarsfeld representation of the model replications, the better the agreement is between the recorded detection history and the simulation output. Hence one need only be concerned with rejecting the simulation model if the likelihood of the recorded detection sequence is low relative to the likelihoods of the model replications.
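The estimation and rank-test steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the APSURV implementation: the correlation term is truncated at `max_order` (second order by default, whereas the application used fourth order with a twelve-hour window, which is not reproduced here), $N$ counts replications whose likelihood does not exceed $q$, and all function names are hypothetical.

```python
from itertools import combinations
from math import prod, sqrt

def estimate_v(sims):
    """Eq. (3): column means, moved off 0 and 1 by 1/(2n)."""
    n, m = len(sims), len(sims[0])
    v = []
    for i in range(m):
        ones = sum(row[i] for row in sims)
        if ones == 0:
            v.append(1 / (2 * n))
        elif ones == n:
            v.append(1 - 1 / (2 * n))
        else:
            v.append(ones / n)
    return v

def standardize(x, v):
    """Eq. (4): z-values of one binary vector."""
    return [(xi - vi) / sqrt(vi * (1 - vi)) for xi, vi in zip(x, v)]

def estimate_r(sims, v, max_order=2):
    """Eq. (5), truncated at max_order: sample means of products of z-values."""
    n, m = len(sims), len(sims[0])
    zs = [standardize(row, v) for row in sims]
    r = {}
    for order in range(2, max_order + 1):
        for S in combinations(range(m), order):
            r[S] = sum(prod(z[i] for i in S) for z in zs) / n
    return r

def likelihood(x, v, r):
    """Eq. (6): truncated f-hat times the independence term; may be negative."""
    z = standardize(x, v)
    f = 1 + sum(rS * prod(z[i] for i in S) for S, rS in r.items())
    p1 = prod(vi if xi else 1 - vi for xi, vi in zip(x, v))
    return f * p1

def rank_test(sims, observed, alpha, max_order=2):
    """Return (N, accepted), where N counts replications with likelihood <= q."""
    v = estimate_v(sims)
    r = estimate_r(sims, v, max_order)
    q = likelihood(observed, v, r)
    N = sum(1 for row in sims if likelihood(row, v, r) <= q)
    return N, N >= len(sims) * alpha

# Hypothetical toy input: four perfectly correlated replications of length 2.
sims = [(1, 1), (1, 1), (0, 0), (0, 0)]
result_unlike = rank_test(sims, (1, 0), alpha=0.3)   # observed pattern never simulated
result_like = rank_test(sims, (1, 1), alpha=0.3)     # observed pattern matches replications
```

Because the likelihoods are floating-point values, ties between mathematically equal likelihoods can be affected by rounding; exact rational arithmetic avoids this when only the independence term is used.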

Finally it should be mentioned that although Johnson and Wiener [A4] could not know the structure of the probability distribution they were working with, in particular the exact structure of the correlation function, they did address the problem of establishing the adequacy of the number of replications used (values of $n$) via a Smirnov two-sided goodness-of-fit test. The distribution of likelihoods (relative to their respective Bahadur representations) of three independent sets of fifty replications for the same target and acoustic sensor were generated. The Smirnov two-sided test for goodness-of-fit (at the 0.20 significance level) indicated that the three samples could be assumed to have come from the same population. From this it was concluded that fifty replications of the model are sufficient for validation purposes. In a similar manner three independent sets of twenty replications each for the same target and acoustic sensor were generated. Although these samples passed the Smirnov two-sided goodness-of-fit test at the 0.10 significance level, at the 0.20 significance level the test indicated the three samples did not come from the same population. From this it was concluded that the variability between samples of only 20 replications was too great to permit their use in a validation.
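For orientation, the statistic underlying a two-sample Smirnov comparison is the largest absolute difference between two empirical distribution functions. The sketch below computes only that statistic; how the three samples were actually compared is not stated here, so the pairwise comparison shown is an assumption, the sample values are hypothetical, and critical values for the 0.10 and 0.20 levels would have to be taken from published tables.

```python
from itertools import combinations

def smirnov_statistic(a, b):
    """Two-sample statistic: max |F_a(x) - F_b(x)| over the pooled sample points."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        fa = sum(1 for y in a if y <= x) / len(a)   # empirical CDF of a at x
        fb = sum(1 for y in b if y <= x) / len(b)   # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d

# Hypothetical likelihood samples from three independent sets of replications.
samples = [[0.11, 0.20, 0.35], [0.10, 0.22, 0.31], [0.12, 0.25, 0.40]]
pairwise = [smirnov_statistic(x, y) for x, y in combinations(samples, 2)]
```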
Appendix B

MONTE  CARLO  EVALUATION 


In this appendix a detailed presentation is given of the data collected in performing an evaluation of the test technique developed by Johnson and Wiener to validate the APSURV Mod 1.4 simulation model.

Three factors were investigated: first, the ability of the test to generate a probability of false rejection $\hat{\alpha}$ which approximates the specified value $\alpha$; second, the influence of computed values $\hat{f}(x)$ of the correlation term $f(x)$ in Eq. (6) of Appendix A; and third, the power of the test in correctly rejecting vectors which did not come from the same population as did the original set.

The approach taken used sets of $n + 1$ random binary vectors which were repeatedly generated from known distributions. Several cases were considered by varying the vector length $m$ and by varying $n$ for each $m$. The vector elements in all cases were independent Bernoulli random variables where the probability of a 1 varied by vector element. The various cases of $m$ considered were 2, 3, 4, 5 and 6 vector elements. For the case $m = 6$ the probabilities of a 1 for all vector elements were (.2, .4, .6, .8, .8, .8) respectively. Similarly one used for $m = 5$ the probability vector (.2, .4, .6, .8, .8), for $m = 4$ (.2, .4, .6, .8), for $m = 3$ (.2, .4, .6) and for $m = 2$ (.2, .4). For all of these distributions the correlation function has a theoretical value of $f(x) = 1$. Two sets of data were obtained. One set used the theoretical values $f(x) = 1$, the other the computed estimates $\hat{f}(x)$. For the case $m = 5$ tests were also conducted where the $(n + 1)$-st vector was generated from a known distribution other than the distribution from which the first $n$ vectors were generated. These alternate distributions consisted of independent, identically distributed vector elements where the probability of a 1 was given by $p = .1$ in one case and $p = .5$ in another.

In a typical case, for example $m = 3$ and $n = 20$, with a random number subroutine 21 vectors of length 3 are generated and the test applied to determine whether the 21-st vector belongs to the same population from which the first 20 come. The decision is made to reject or not to reject this hypothesis. The procedure is repeated 20 times and the proportion of rejections is computed. In turn the sets of 20 are repeated 100 times. The one hundred computed proportions of rejections are then treated as 100 independent, identically distributed random variables whose mean is an estimate of the probability of false rejection resulting from the application of the test. When the $(n + 1)$-st vector is generated from a specified alternate distribution the proportion of rejections in this case estimates the power of the test, that is, the probability that the null hypothesis is correctly rejected.
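The nested procedure just described can be sketched as follows for the case in which the correlation term is fixed at $f(x) = 1$. This is an illustrative sketch rather than the original program: the random number generator, the seed, and the function names are assumptions, and $N$ is taken to count replications whose likelihood does not exceed $q$. Passing a different `observed_probs` turns the false-rejection estimate into a power estimate.

```python
import random
from fractions import Fraction
from math import prod

def draw(probs, rng):
    """One binary vector with independent Bernoulli elements."""
    return tuple(1 if rng.random() < p else 0 for p in probs)

def estimate_v(sims):
    """Column means with the 1/(2n) correction at 0 and 1, kept as exact fractions."""
    n = len(sims)
    v = []
    for column in zip(*sims):
        ones = sum(column)
        if ones == 0:
            v.append(Fraction(1, 2 * n))
        elif ones == n:
            v.append(1 - Fraction(1, 2 * n))
        else:
            v.append(Fraction(ones, n))
    return v

def independent_likelihood(x, v):
    """Likelihood with the correlation term fixed at f(x) = 1."""
    return prod(vi if xi else 1 - vi for xi, vi in zip(x, v))

def one_test(probs, observed_probs, n, alpha, rng):
    """Generate n + 1 vectors; return True if the (n + 1)-st is rejected."""
    sims = [draw(probs, rng) for _ in range(n)]
    observed = draw(observed_probs, rng)
    v = estimate_v(sims)
    q = independent_likelihood(observed, v)
    N = sum(1 for row in sims if independent_likelihood(row, v) <= q)
    return N < n * alpha

def rejection_proportions(probs, observed_probs, n, alpha, k, z, seed=0):
    """k proportions of rejections, each computed from z applications of the test."""
    rng = random.Random(seed)
    return [sum(one_test(probs, observed_probs, n, alpha, rng) for _ in range(z)) / z
            for _ in range(k)]

# Typical case from the text: m = 3, n = 20, 100 sets of 20 tests, alpha = .3.
probs = (.2, .4, .6)
props = rejection_proportions(probs, probs, n=20, alpha=Fraction(3, 10), k=100, z=20)
alpha_hat = sum(props) / len(props)
```

The estimate `alpha_hat` depends on the generator and seed, so it should not be expected to reproduce the tabulated values digit for digit.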

Tables B1 and B2 summarize the data collected in the Monte Carlo evaluation of the test. Table B1 consists of values obtained with $f(x) = 1$ and Table B2 the values obtained with $f(x) = \hat{f}(x)$, as computed from the data. As usual, $m$ denotes the vector length and $n$ the number of replications from which the Bahadur-Lazarsfeld representation is obtained for each distribution. Each case $(m, n, 1)$ or $(m, n, \hat{f})$ yields an estimate $\hat{\alpha}$. This estimate is the average proportion of rejections computed from $k$ subsets each of size $z$. The product $k \times z$ represents the total number of times the statistical test is performed for each case $(m, n, 1)$ or $(m, n, \hat{f})$. The computed sample standard deviation of the estimate is denoted $s$. The factor $s/\sqrt{k}$ is used to obtain confidence intervals for $\alpha$ using percentage points of the Student-t distribution. For each vector length $m$ the size of the population of such vectors is $2^m$ possible binary vectors. In order to obtain a uniform measure of the relative size of the number of replications from which the Bahadur-Lazarsfeld representation is obtained for each case $(n, m)$, the ratio $n/2^m$ is used. These ratios are listed along with the data in Tables B1 and B2. In Table B2 the
Table B1 - Test specified for $\alpha = 0.3$
Data obtained with $f(x) = 1$

Columns: vector length $m$; number of subsets / size of each subset, $k/z$; number of replications per case, $n$; ratio $n/2^m$; estimate of $\alpha$, $\hat{\alpha}$; sample standard deviation, $s$.

  m    k/z            n      n/2^m    alpha-hat    s
  ------------------------------------------------------------
  2    [illegible]    20     5        .1900        .095
  2    [illegible]    50     12.5     .2015        .0963
  2    100/20         100    25       .1940        .0799
  2    100/20         200    50       .1910        [illegible]
  3    100/20         20     2.5      .2655        .0892
  3    100/20         50     6.25     .2765        .1011
  3    100/20         100    12.5     .2795        .1054
  3    100/20         200    25       .2770        .0968
  3    40/10          400    50       .2300        .1436
  4    100/20         20     1.25     .2935        .1079
  4    100/20         50     3.125    .2770        .0988
  4    100/20         100    6.25     .2670        .0972
  4    100/20         200    12.5     .2920        .1167
  4    40/10          400    25       .2725        .1281
  5    100/20         20     .625     .3395        .1142
  5    100/20         50     1.56     .3090        .1021
  5    100/20         100    3.125    .3280        .1081
  5    100/20         200    6.25     .2985        .0981
  5    40/10          400    12.5     .2800        .1652
  5    20/10          800    25       .2900        .1619
  6    100/20         100    1.56     .2975        .1003
  6    100/20         200    3.125    .3095        .1053
  6    40/10          400    6.25     .2700        .1381
  6    20/10          800    12.5     .3150        .1424


number of the vectors resulting with negative likelihoods for each run is denoted $w$ and the number of these which were the observed vector is denoted $w_0$. Figure B1 presents a comparison of estimates $\hat{\alpha}$ as a function of $(n/2^m)$ for $f(x) = 1$ and $f(x) = \hat{f}$ where $m = 5$. It can be observed that as $(n/2^5)$ increases the estimates $\hat{\alpha}$ for the case $f(x) = \hat{f}$ approach the estimates $\hat{\alpha}$ for the case $f(x) = 1$, both approaching the region $\alpha = .3$. In the case $f(x) = \hat{f}$ this is accompanied by a decrease in the resulting number of negative vectors, as is indicated in Table B2. Based on a comparison with the case $f(x) = 1$, the values $n/2^5$ where the estimates $\hat{\alpha}$ for the case $f(x) = \hat{f}$ are considered adequate are 6.25, 12.5 and 25. The resulting negative vectors in these cases were $w = 7$, $w = 4$ and $w = 1$ respectively, as compared to $w = 273$, $w = 121$ and $w = 65$ for the cases $(n/2^5) = .625$, $(n/2^5) = 1.56$ and $(n/2^5) = 3.125$ respectively. It should be noted that for these last three cases the value of $k \times z$ is equal to 2000, that is, a total of 2000 sample points, whereas for the previous three cases there were only 1200, 400 and 200 sample points ($k \times z$) respectively. A simple calculation yields $7/1200 = .006$, $4/400 = .01$ and $1/200 = .005$ negative vectors per sample point. Multiplying by 2000 yields 11.67, 20 and 10 negative vectors respectively, still a small number compared to the other three cases; hence the number of negative vectors does decrease as $n$ increases.
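The normalization to a common number of sample points is simple arithmetic and can be reproduced directly. The short sketch below uses only the counts quoted in the paragraph above; the dictionary keys and the function name are illustrative.

```python
from fractions import Fraction

# For m = 5: ratio n / 2^5 -> (sample points k * z, negative-likelihood vectors w)
cases = {"6.25": (1200, 7), "12.5": (400, 4), "25": (200, 1)}

def scaled_count(sample_points, w, reference=2000):
    """Negative vectors expected if the run had `reference` sample points."""
    return Fraction(w, sample_points) * reference

scaled = {ratio: float(scaled_count(*case)) for ratio, case in cases.items()}
```

The scaled counts (about 11.67, 20 and 10) can then be compared on an equal footing with the 273, 121 and 65 negative vectors observed in the runs that used 2000 sample points.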
Table B2 - Results of Simulation Experiments
Test specified for $\alpha = 0.3$
Data obtained with $f(x) = \hat{f}$

Columns: vector length $m$; number of subsets / size of each subset, $k/z$; number of replications per case, $n$; ratio $n/2^m$; estimate of $\alpha$, $\hat{\alpha}$; sample standard deviation, $s$; vectors with negative likelihood, $w$; number of observed vectors with negative likelihood, $w_0$.

  m    k/z       n      n/2^m    alpha-hat    s        w              w_0
  ------------------------------------------------------------------------
  2    100/20    20     5        .1950        .0973    [illegible]    11
  2    100/20    50     12.5     .2030        .0982    0              0
  2    100/20    100    25       .1945        .0794    0              0
  2    100/20    200    50       .1910        .0880    0              0
  3    100/20    20     2.5      .2980        .0951    18             18
  3    100/20    50     6.25     .2835        .0959    0              0
  3    100/20    100    12.5     .2795        .0985    0              0
  3    100/20    200    25       .2795        .0980    0              0
  3    40/10     400    50       .2600        .1464    0              0
  4    100/20    20     1.25     .3895        .0936    69             69
  4    100/20    50     3.125    .3130        .1058    0              0
  4    100/20    100    6.25     .2855        .1003    0              0
  4    100/20    200    12.5     .3060        .1162    0              0
  4    40/10     400    25       .2925        .1403    0              0
  5    100/20    20     .625     .4785        .1231    273            273
  5    100/20    50     1.56     .3715        .1179    121            121
  5    100/20    100    3.125    .3600        .1094    65             65
  5    60/20     200    6.25     .3233        .1006    7              7
  5    40/10     400    12.5     .3075        .1607    4              1
  5    20/10     800    25       .2850        .1631    1              1
  6    100/20    100    1.56     .3835        .1010    113            113
  6    50/20     200    3.125    .3440        .1067    31             27
  6    40/10     400    6.25     .3000        .1468    14             4
  6    10/10     800    12.5     .3400        .1578    9              1
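The confidence intervals mentioned in this appendix combine a tabulated estimate with the factor $s/\sqrt{k}$ and a percentage point of the Student-t distribution. The sketch below works one row as an example; the 95 percent level and the quoted t value (approximately 1.984 for 99 degrees of freedom, taken from standard tables) are assumptions, since the confidence level actually used is not stated, and the function name is hypothetical.

```python
from math import sqrt

def confidence_interval(alpha_hat, s, k, t_crit):
    """Two-sided interval alpha_hat +/- t_crit * s / sqrt(k)."""
    half_width = t_crit * s / sqrt(k)
    return alpha_hat - half_width, alpha_hat + half_width

# Row m = 3, n = 20 with f(x) estimated: alpha_hat = .2980, s = .0951, k = 100.
T_975_DF99 = 1.984   # assumed 97.5th percentage point of Student-t, 99 d.f. (approximate)
low, high = confidence_interval(0.2980, 0.0951, 100, T_975_DF99)
```

Under these assumptions the interval is roughly (.279, .317), which contains the specified value $\alpha = .3$.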

Vectors with negative likelihoods indicate an unstable correlation function. As the number of rows ($n$) increases the correlation function begins to settle at about its theoretical value 1 and the number of negative likelihood vectors decreases. The larger the number of outcomes ($2^m$) of the population from which the distribution is estimated, the larger the number of rows ($n$) must be for the negative values to begin to disappear. The reason is illustrated as follows:

Consider the term

$$\sum_{i<j} \hat{r}_{ij} z_i z_j .$$

The number of additions or the cardinality of the set

$$\{(i, j) : i < j;\ i = 1, \ldots, m - 1;\ j = 2, \ldots, m\}$$

increases as $m$ increases. The same happens for higher order correlations. Even as $\hat{r}_{ij} \to 0$ fast the increased number of additions counteracts this effect somewhat as $m$ increases. Hence for larger $m$'s the rate of decrease of vectors with negative likelihoods decreases even though the actual number of negative vectors continues to decrease as $n$ increases.
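The growth in the number of terms can be made concrete: the second-order sum has $\binom{m}{2}$ terms, the order-$s$ sum has $\binom{m}{s}$ terms, and the full correlation function has $2^m - m - 1$ terms in all. A small sketch (the function name is illustrative):

```python
from math import comb

def correlation_term_counts(m):
    """Number of correlation terms of each order 2..m for vectors of length m."""
    return {order: comb(m, order) for order in range(2, m + 1)}

for m in range(2, 7):
    counts = correlation_term_counts(m)
    # vector length, number of second-order terms, total number of correlation terms
    print(m, counts[2], sum(counts.values()))
```

Each estimated correlation carries its own sampling error, so the more terms are summed, the larger $n$ must be before the estimated correlation function settles near 1.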
In the procedure used, whenever the observed $(n + 1)$-st vector had a negative likelihood it was assigned a very small probability and was automatically rejected. The overall rejection criterion was fixed in all cases at $\alpha = .3$. The sample estimate of the probability of rejection $\hat{\alpha}$ is very sensitive to the correlation function effects. Two effects can be distinguished: first, correlation term values that deviate significantly from $f(x) = 1$ tend to misrepresent the probability distribution; second, this same fluctuation generates a large number of negative observed vectors, which inflates the proportion of rejections as such vectors are automatically rejected. Except for large ratios $(n/2^m)$ a large proportion of vectors with negative likelihood turn out to be the observed vector. This is the vector not included in obtaining the Bahadur representation. A possible explanation is that the Bahadur representation tends to represent its source. Independent of other properties of the test, this is a desirable property in the sense that the test serves its purpose as a discriminating tool. All of these effects subside simultaneously as $n$ or $(n/2^m)$ increases.
In each Monte Carlo run to estimate $\alpha$, the first $n \times m$ matrix of 0's and 1's was printed along
with the corresponding value of $f(x)$ for each row vector. Relative frequency histograms of the n
values of $f(x)$ were plotted for each case (m, n). These plots revealed the expected result that, as n
increases for a specific m, the spread of the values of $f(x)$ narrows toward a spike at 1. An example is
the case m = 5 and n = 500 plotted in Figure B4. The mean value of $f(x)$ is 1.037 with standard
deviation 0.251. Figure B5 shows a plot of the mean values of $f(x)$ for m = 5 as a function of
n = 20, 50, 100, 200, 400 and 800. The mean values consistently overshoot the theoretical value
$f(x) = 1$. For values of n greater than or equal to 200 the estimates are deemed close enough for an
adequate approximation to the probability distribution they attempt to represent.
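
The estimation of $f(x)$ described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard Bahadur expansion truncated at the pairwise (second-order) correlation terms, with all moments estimated from the sample. The truncation order, the function names, the seed, and the probability vector as read from the text are assumptions and are not taken from the procedure actually used in the report.

```python
import itertools
import math
import random


def simulate(n, probs, rng):
    """Draw n binary vectors whose elements are independent Bernoulli(probs[i])."""
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(n)]


def bahadur_f_second_order(sample):
    """Estimate f(x) for every row of the sample, keeping only pairwise terms.

    Assumption: f(x) = 1 + sum_{i<j} r_ij z_i z_j, where z_i is the
    standardized element and r_ij is the sample mean of z_i z_j.
    """
    n, m = len(sample), len(sample[0])
    p_hat = [sum(row[i] for row in sample) / n for i in range(m)]
    sd = [math.sqrt(p * (1 - p)) for p in p_hat]
    z = [[(row[i] - p_hat[i]) / sd[i] for i in range(m)] for row in sample]
    pairs = list(itertools.combinations(range(m), 2))
    r_hat = {(i, j): sum(zr[i] * zr[j] for zr in z) / n for i, j in pairs}
    f_hat = [1 + sum(r_hat[i, j] * zr[i] * zr[j] for i, j in pairs) for zr in z]
    return f_hat, r_hat


rng = random.Random(0)                    # fixed seed, chosen arbitrarily
probs = [0.2, 0.4, 0.6, 0.5, 0.8]         # base-case vector as read from the text
sample = simulate(200, probs, rng)        # m = 5, n = 200
f_hat, r_hat = bahadur_f_second_order(sample)

mean_f = sum(f_hat) / len(f_hat)
sum_r2 = sum(r * r for r in r_hat.values())
print(mean_f, 1 + sum_r2)
```

Under these assumptions the sample mean of the estimated $f(x)$ equals 1 plus the sum of the squared estimated correlations, so it cannot fall below 1. That is at least consistent with the overshoot reported above, although the report's own estimator may differ.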


Fig. B5 - Mean values of $f(x)$ as a function of n

Finally the power of the test was considered. The same Monte Carlo procedure was applied, where
the observed vector was now generated from an alternate distribution. Two alternate distributions were
considered. In the first, all vector elements were independent, identically distributed (iid) Bernoulli
random variables with p = .1; in the second, all vector elements were again iid Bernoulli random
variables with p = .5. For each of these distributions a Monte Carlo run was conducted, first using
$f(x) = 1$ and then using $f(x) = \hat{f}(x)$. The computed proportions now represent the power of the
test. The case tried was m = 5 and n = 200. Not only did the test appear to discriminate well, but no
appreciable difference was noted between the cases $f(x) = 1$ and $f(x) = \hat{f}(x)$. Figures B6, B7, B8
and B9 show plots of the relative frequency histograms resulting from estimating $1 - \beta$, the power of
the test, for all cases considered, where $\beta$ is the probability of a type II error, that is, of incorrectly
accepting the (n + 1)-st vector. The figures show lower power for the case p = 0.5 than for p = 0.1.
For p = 0.5 the power is approximately 0.70; for p = 0.1 it is approximately 0.90. This is as it should
be, since p = 0.1 is "farther" from the simulated case than p = 0.5, which is closer to the simulated
probability vector (0.2, 0.4, 0.6, 0.5, 0.8). The test discriminated very well in both cases.
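
The informal notion of "farther" used above can be made concrete with a small calculation. The sketch below uses the mean absolute difference between a constant alternative p and the base-case probabilities. This distance measure and the names are assumptions chosen for illustration; the report does not define a distance, and the base-case vector is taken as read from the text.

```python
# Base-case probability vector as read from the text (an assumption of this sketch).
SIMULATED = [0.2, 0.4, 0.6, 0.5, 0.8]


def mean_abs_gap(p, base=SIMULATED):
    """Mean absolute difference between a constant p and the base probabilities."""
    return sum(abs(p - b) for b in base) / len(base)


gap_p01 = mean_abs_gap(0.1)   # alternative with p = 0.1
gap_p05 = mean_abs_gap(0.5)   # alternative with p = 0.5
print(gap_p01, gap_p05)
```

By this measure the alternative p = 0.1 sits about 0.40 from the base case on average, against about 0.16 for p = 0.5, which agrees with the ordering of the reported powers.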

The Monte Carlo evaluation indicates that $\hat{\alpha}$ is approximately $\alpha$ when both m and n are large, that
$\hat{f}(x)$ gets close to $f(x) = 1$ as n increases, and that the power of the test increases as the
"differences" between the base case and the (n + 1)-st vector increase.


Appendix  C 

DETAILS  OF  ANALYTICAL  FORMULATION 


The particular data structure considered in the analytical formulation of the problem shows that
the probability of false rejection when applying the test may approach the significance level $\alpha$ as m,
the vector length, increases. For any vector length m there are $2^m$ possible binary outcomes, or
members of the population of vectors of length m. The analysis consists of letting $n = 2^m$, where all
binary vector elements are independent, identically distributed Bernoulli random variables with the
probability of a 1 given by v. There are $(2^m)^{2^m}$ possible outcomes from which the Bahadur
representation may be obtained. To each of these outcomes one may associate $2^m$ possible (n + 1)-st,
or observed, vectors, yielding a total of $(2^m)^{2^m + 1}$ elementary outcomes. The case where the
observed vector comes from the same distribution as the set of first $n = 2^m$ vectors is considered first.
Specifying $\alpha$, evaluating N for each outcome, and applying the rules of the test, one may establish an
association between $\alpha$ and the actual probability of false rejection. Both the cases m = 1 and m = 2
were discussed in section 5. This appendix presents further results on the analytical approach. It deals
specifically with the case m = 2.

Applying the rules of the test by direct enumeration, one obtains that the possible values of N are
0, 1, 2, or 4. Here the test indicates that for $\alpha \in (0, 1/4]$ one rejects the null hypothesis if and only
if N = 0, for $\alpha \in (1/4, 1/2]$ one rejects if and only if $N \le 1$ (N = 0 or 1), and for
$\alpha \in (1/2, 1)$ one rejects if and only if $N \le 2$ (N = 0, 1 or 2). The case $\alpha = 1$ is not realistic
and should be ignored as a triviality. By allowing the (n + 1)-st vector to come from an alternate
distribution, in this case with each vector element an independent Bernoulli random variable with
probability of a 1 given by t, one may then analyze the resulting probability of correctly rejecting the
null hypothesis, that is, the power of the test.

Consider the analysis for the case m = 2. In this case there are 256 outcomes from which the
Bahadur-Lazarsfeld representation may be obtained. To each of these outcomes one may associate 4
possible observed vectors, yielding a total of 1024 elementary outcomes to be considered. These were
listed in detail and the functions $p_{r_1}(v)$, $p_{r_2}(v)$ and $p_{r_3}(v)$ were obtained along with their
corresponding power functions $1 - \beta_1$, $1 - \beta_2$ and $1 - \beta_3$. The value of N may be 0, 1, 2 or
4, yielding three intervals in which $\alpha$ may assume values: $I_1 = (0, 1/4]$, $I_2 = (1/4, 1/2]$ and
$I_3 = (1/2, 1]$. To each of these intervals correspond one $p_r(v)$ function and one $1 - \beta$
function, as indicated by the respective indices. The functions $p_r(v)$ are of the form

$$p_r(v) = \sum_{i=1}^{9} a_i \, v^{10-i} (1 - v)^{i}$$

where the $a_i$'s are positive integers depending on the number of outcomes corresponding to each case
or to the possible values of N. The functions are symmetric, such that for each term $a \, v^x (1 - v)^y$,
$x \ne y$, there corresponds a term $b \, v^y (1 - v)^x$ where a = b. The probability of rejection
functions and the power functions for the case m = 2 may be obtained by listing the 1024 elementary
outcomes and their corresponding values of N. One may then single out those events that correspond
to a rejection of the null hypothesis for each of the three subintervals where $\alpha$ assumes its values.
The events for which N = 0 are rejection events for $\alpha \in (0, 1/4]$, the events for which N = 0 or 1
are rejection events for $\alpha \in (1/4, 1/2]$, and the events for which N = 0, 1 or 2 are rejection events
for $\alpha \in (1/2, 1)$. One associates to each event its corresponding probability and adds the event
probabilities corresponding to each
subinterval where $\alpha$ assumes values. Table C1 lists the probability of false rejection functions
$p_{r_1}(v)$, $p_{r_2}(v)$ and $p_{r_3}(v)$. The first column lists the $a_i$ accompanying each term for the
function $p_{r_1}(v)$, which corresponds to all cases N = 0, or $\alpha \in (0, 1/4]$. The second column lists
the additional $a_i$ terms that must be added to $p_{r_1}(v)$ to form the function $p_{r_2}(v)$, which
corresponds to the cases N = 0 or N = 1 (i.e., $\alpha \in (1/4, 1/2]$). The last column lists the additional
$a_i$ terms that must be added to $p_{r_2}(v)$ to form the function $p_{r_3}(v)$, which corresponds to the
cases N = 0, 1 or 2 (i.e., $\alpha \in (1/2, 1]$). Another way to interpret Table C1 is that column 1
consists of all probabilities for which N = 0, column 2 consists of all probabilities for which N = 1, and
column 3 consists of all probabilities for which N = 2.

Table C1 - Functions $p_{r_1}(v)$, $p_{r_2}(v)$ and $p_{r_3}(v)$
Each $p_r(v) = \sum_i a_i \, v^{10-i} (1 - v)^{i}$

Term               $a_i$ terms for          additional $a_i$ terms for    additional $a_i$ terms for
                   $\alpha \in (0, 1/4]$    $\alpha \in (1/4, 1/2]$       $\alpha \in (1/2, 1]$

$v^9(1-v)$         2                        -                             -
$v^8(1-v)^2$       9                        8                             -
$v^7(1-v)^3$       20                       16                            24
$v^6(1-v)^4$       34                       60                            24
$v^5(1-v)^5$       42                       102                           -
$v^4(1-v)^6$       34                       60                            24
$v^3(1-v)^7$       20                       16                            24
$v^2(1-v)^8$       9                        8                             -
$v(1-v)^9$         2                        -                             -

Table C2 lists the power functions $1 - \beta_1$, $1 - \beta_2$ and $1 - \beta_3$ in a manner similar to
Table C1. The power function $1 - \beta$ consists of sums of terms of the form

$$a \, v^{x} (1 - v)^{y} \, t^{r} (1 - t)^{s}$$

where x + y = 8 and r + s = 2; x, y = 0, 1, 2, ..., 8 and r, s = 0, 1, 2.
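
The counting argument and the term structure above can be checked by brute force for m = 2. The sketch below enumerates the 256 base outcomes and the 4 possible observed vectors, and assigns each elementary outcome its probability $v^x (1 - v)^y t^r (1 - t)^s$. It does not evaluate N, since the rule for computing N is defined elsewhere in the report. The function name and the value of t are hypothetical; v = .10 is the value quoted for Fig. C2.

```python
from itertools import product


def outcome_probability(base_bits, observed_bits, v, t):
    """Probability of one elementary outcome: v^x (1-v)^y t^r (1-t)^s."""
    x = sum(base_bits)                 # ones among the n*m base elements
    y = len(base_bits) - x             # zeros among the base elements
    r = sum(observed_bits)             # ones in the observed vector
    s = len(observed_bits) - r         # zeros in the observed vector
    return v**x * (1 - v)**y * t**r * (1 - t)**s


m = 2
n = 2**m
base_outcomes = list(product([0, 1], repeat=n * m))    # (2^m)^(2^m) = 256
observed_vectors = list(product([0, 1], repeat=m))     # 2^m = 4
n_elementary = len(base_outcomes) * len(observed_vectors)

v, t = 0.10, 0.60   # t = 0.60 is an arbitrary illustrative value
total = sum(
    outcome_probability(b, o, v, t)
    for b in base_outcomes
    for o in observed_vectors
)
print(len(base_outcomes), n_elementary, total)
```

The probabilities of the 1024 elementary outcomes sum to 1 (up to rounding), as they must if every outcome is accounted for by a term with x + y = 8 and r + s = 2.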

The function $p_{r_1}(v)$ has range [0, .173], $p_{r_2}(v)$ has range [0, .430] and $p_{r_3}(v)$ has range
[0, .523]. Their curves are plotted in Fig. C1. The power curves $1 - \beta_1$, $1 - \beta_2$ and
$1 - \beta_3$ are plotted in Fig. C2, where v = .10. Figure C2 indicates that the power of the test behaves
as it is supposed to for most of its range: the more dissimilar the values of v and t, the higher the
power of the test. Figure C1 illustrates the beginnings of the conjectured behavior for $p_r(v)$ as m
increases and $\alpha$ becomes small. The case $\alpha \in I_1$ should be of particular interest to the reader in
relation to the conclusions reached in the text section of this report.
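
As a rough numerical check on the case $\alpha \in (0, 1/4]$, the function $p_{r_1}(v)$ can be evaluated directly from the first column of Table C1. The sketch below assumes the coefficients as transcribed from that column and evaluates the polynomial on a grid of v values. The grid spacing and the names are choices of this illustration only.

```python
# Coefficients a_i for i = 1..9, transcribed from the first column of Table C1.
A_R1 = [2, 9, 20, 34, 42, 34, 20, 9, 2]


def p_r(v, coeffs):
    """Evaluate sum_i a_i v^(10-i) (1-v)^i for i = 1..len(coeffs)."""
    return sum(a * v**(10 - i) * (1 - v)**i for i, a in enumerate(coeffs, start=1))


grid = [k / 1000 for k in range(1001)]        # v = 0.000, 0.001, ..., 1.000
values = [p_r(v, A_R1) for v in grid]
peak = max(values)
v_at_peak = grid[values.index(peak)]          # first grid point attaining the peak
print(round(peak, 3), v_at_peak)
```

On this grid the maximum comes out close to .173, in line with the range quoted above for $p_{r_1}(v)$, and it is attained away from v = .5. At v = .5 every term equals 1/1024, so the function reduces to 172/1024.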
Table C2 - Individual Terms for the Power Functions $1 - \beta_1$, $1 - \beta_2$ and $1 - \beta_3$

Term                       a terms for              additional a terms for     additional a terms for
                           $\alpha \in (0, 1/4]$    $\alpha \in (1/4, 1/2]$    $\alpha \in (1/2, 1)$

$v^8 t(1-t)$
$v^8 (1-t)^2$
$v^7(1-v) t^2$
$v^7(1-v) t(1-t)$
$v^7(1-v) (1-t)^2$
$v^6(1-v)^2 t^2$
$v^6(1-v)^2 t(1-t)$
$v^6(1-v)^2 (1-t)^2$
$v^5(1-v)^3 t^2$
$v^5(1-v)^3 t(1-t)$
$v^5(1-v)^3 (1-t)^2$
$v^4(1-v)^4 t^2$
$v^4(1-v)^4 t(1-t)$
$v^4(1-v)^4 (1-t)^2$
$v^3(1-v)^5 t^2$
$v^3(1-v)^5 t(1-t)$
$v^3(1-v)^5 (1-t)^2$
$v^2(1-v)^6 t^2$
$v^2(1-v)^6 t(1-t)$
$v^2(1-v)^6 (1-t)^2$
$v(1-v)^7 t^2$
$v(1-v)^7 t(1-t)$
$v(1-v)^7 (1-t)^2$
$(1-v)^8 t^2$
$(1-v)^8 t(1-t)$


Fig. C2 - Power functions (m = 2,


